Voice assistants are deployed widely and provide useful functionality. However, recent work has shown that commercial systems like Amazon Alexa and Google Home are vulnerable to voice-based confusion attacks that exploit design issues. We propose a systems-oriented defense against this class of attacks and demonstrate its functionality for Amazon Alexa. We ensure that only the skills a user intends execute in response to voice commands. Our key insight is that we can interpret a user's intentions by analyzing their activity on counterpart systems of the web and smartphones. For example, the Lyft ride-sharing Alexa skill has an Android app and a website. Our work shows how information from counterpart apps can help reduce ambiguities in the skill invocation process. We build SkillFence, a browser extension that existing voice assistant users can install to ensure that only legitimate skills run in response to their commands. Using real user data from MTurk (N = 116) and experimental trials involving synthetic and organic speech, we show that SkillFence provides a balance between usability and security by securing 90.83% of skills that a user will need with a false acceptance rate of 19.83%.
Concept bottleneck models (CBMs) (Koh et al. 2020) are interpretable neural networks that first predict labels for human-interpretable concepts relevant to the prediction task, and then predict the final label based on the concept label predictions. We extend CBMs to interactive prediction settings where the model can query a human collaborator for the labels of some concepts. We develop an interaction policy that, at prediction time, chooses which concepts to request a label for so as to maximally improve the final prediction. We demonstrate that a simple policy combining concept prediction uncertainty and influence of the concept on the final prediction achieves strong performance and outperforms a static approach proposed in Koh et al. (2020) as well as active feature acquisition methods proposed in the literature. We show that the interactive CBM can achieve accuracy gains of 5-10% with only 5 interactions over competitive baselines on the Caltech-UCSD Birds, CheXpert and OAI datasets.
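A policy that combines concept uncertainty with concept influence might look roughly like the sketch below. The abstract does not give the scoring rule, so everything here is an assumption for illustration: uncertainty is taken as the binary entropy of the predicted concept probability, influence as the absolute weight of the concept in a linear final layer, and the two are multiplied. The names `binary_entropy` and `choose_concepts` are hypothetical.

```python
import math


def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli(p) concept prediction."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def choose_concepts(concept_probs, influence, budget):
    """Return the indices of the `budget` concepts to ask the human about.

    Assumed scoring rule (not taken from the paper):
    score = uncertainty * |influence|.
    """
    scores = [
        binary_entropy(p) * abs(w)
        for p, w in zip(concept_probs, influence)
    ]
    # Rank concept indices from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Predicted concept probabilities and (assumed) final-layer weights.
probs = [0.5, 0.95, 0.6, 0.1]
weights = [0.2, 3.0, 1.5, 0.1]
queried = choose_concepts(probs, weights, budget=2)
```

In this toy input the most uncertain concept (probability 0.5) is not queried first, because its assumed influence on the final label is small.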
Abusive language is a concerning problem in online social media. Past research on detecting abusive language covers different platforms, languages, demographics, etc. However, models trained using these datasets do not perform well in cross-domain evaluation settings. To overcome this, a common strategy is to use a few samples from the target domain to train models to get better performance in that domain (cross-domain few-shot training). However, this might cause the models to overfit the artefacts of those samples. A compelling solution could be to guide the models toward rationales, i.e., spans of text that justify the text's label. This method has been found to improve model performance in the in-domain setting across various NLP tasks. In this paper, we propose RAFT (Rationale Adaptor for Few-shoT classification) for abusive language detection. We first build a multitask learning setup to jointly learn rationales, targets, and labels, and find a significant improvement of 6% macro F1 on the rationale detection task over training solely rationale classifiers. We introduce two rationale-integrated BERT-based architectures (the RAFT models) and evaluate our systems over five different abusive language datasets, finding that in the few-shot classification setting, RAFT-based models outperform baseline models by about 7% in macro F1 scores and perform competitively with models finetuned on other source domains. Furthermore, RAFT-based models outperform LIME/SHAP-based approaches in terms of plausibility and are close in performance in terms of faithfulness.
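In principle, applying variational autoencoders (VAEs) to sequential data offers a method for controlled sequence generation, manipulation, and structured representation learning. However, training sequence VAEs is challenging: autoregressive decoders can often explain the data without using the latent space, a failure known as posterior collapse. To mitigate this, state-of-the-art models weaken the powerful decoder by applying uniformly random dropout to the decoder input. We show theoretically that this removes pointwise mutual information provided by the decoder input, which is compensated for by using the latent space. We then propose an adversarial training strategy to achieve information-based stochastic dropout. Compared to uniform dropout on standard text benchmark datasets, our targeted approach increases both sequence modeling performance and the information captured in the latent space.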
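The uniform dropout baseline on decoder inputs is commonly implemented by replacing each input token with a placeholder, independently and with a fixed probability. The sketch below shows that baseline only, not the adversarial, information-based variant the abstract proposes. The function name `word_dropout`, the `<unk>` placeholder, and the rate of 0.3 are assumptions for illustration.

```python
import random


def word_dropout(tokens, rate, unk="<unk>", rng=None):
    """Replace each decoder-input token with `unk` with probability `rate`.

    Every position is dropped independently and with the same probability,
    which is what makes this dropout "uniform".
    """
    if rng is None:
        rng = random.Random()
    return [unk if rng.random() < rate else tok for tok in tokens]


sentence = ["the", "cat", "sat", "on", "the", "mat"]
# A seeded generator keeps the example reproducible.
corrupted = word_dropout(sentence, rate=0.3, rng=random.Random(0))
```

With fewer input tokens to condition on, the decoder has to draw more information from the latent variable, which is the effect the abstract analyzes.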
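Reliable outlier detection is critical for real-world applications of deep learning models. Likelihoods produced by deep generative models, although extensively studied, have largely been dismissed as impractical for outlier detection. First, deep generative model likelihoods are readily biased by low-level input statistics. Second, many solutions for correcting these biases are computationally expensive or do not generalize well to complex, natural datasets. Here, we explore outlier detection with a state-of-the-art deep autoregressive model: PixelCNN++. We show that biases in PixelCNN++ likelihoods arise primarily from predictions based on local dependencies. We propose two families of bijective transformations that we term "shaking" and "stirring", which ameliorate low-level biases and isolate the contribution of long-range dependencies to the PixelCNN++ likelihood. These transformations are computationally inexpensive and readily applied at evaluation time. We evaluate our approaches extensively with five grayscale and six natural image datasets and show that they achieve or exceed state-of-the-art outlier detection performance. In sum, lightweight remedies suffice to achieve robust outlier detection on images with deep generative models.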
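Frozen models trained to mimic static datasets can never improve their performance. Models that can use internet retrieval for up-to-date information and obtain feedback from humans during deployment offer the promise of both adapting to new information and improving their performance. In this work we study how to improve internet-driven conversational skills in such a learning framework. We collect deployment data of human interactions, which we make publicly available, and collect various types of human feedback, including binary quality measurements, free-form text feedback, and fine-grained reasons for failure. We then study various algorithms for improving from such feedback, including standard supervised learning, rejection sampling, model-guiding and reward-based learning, in order to make recommendations on which types of feedback and algorithms work best. We find that the recently introduced Director model (Arora et al., '22) shows significant improvements over other existing approaches.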
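We present BlenderBot 3, a 175B parameter dialogue model capable of open-domain conversation with access to the internet and a long-term memory, and trained on a large number of user-defined tasks. We release both the model weights and code, and have also deployed the model on a public web page to interact with organic users. This technical report describes how the model was built (architecture, model and training scheme) and the details of its deployment, including safety mechanisms. Human evaluations show its superiority to existing open-domain dialogue agents, including its predecessors (Roller et al., 2021; Komeili et al., 2022). Finally, we detail our plan for continual learning using the data collected from deployment, which will also be publicly released. The goal of this research program is thus to enable the community to study ever-improving responsible agents that learn through interaction.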
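Extreme classification (XC) seeks to tag data points with the most relevant subset of labels from an extremely large label set. Performing deep XC with dense, learnt representations for data points and labels has attracted much attention due to its superiority over earlier XC methods that used sparse, hand-crafted features. Negative mining techniques have emerged as a critical component of all deep XC methods, allowing them to scale to millions of labels. However, despite recent advances, training deep XC models with large encoder architectures such as transformers remains challenging. This paper identifies that the memory overheads of popular negative mining techniques often force mini-batch sizes to remain small and slow training down. In response, this paper introduces NGAME, a lightweight mini-batch creation technique that offers provably accurate in-batch negative samples. This allows training with larger mini-batches, offering significantly faster convergence and higher accuracies than existing negative sampling techniques. NGAME was found to be up to 16% more accurate than state-of-the-art methods on a wide array of benchmark datasets for extreme classification, as well as 3% more accurate at retrieving search engine queries in response to a user webpage visit to show personalized ads. In live A/B tests on a popular search engine, NGAME yielded gains of up to 23% in click-through rates.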
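As a rough illustration of what "in-batch negative samples" means, the sketch below treats the positive labels of the other data points in a mini-batch as the negatives for each data point. It covers only this generic idea. The abstract does not describe how NGAME composes its mini-batches, and that step is not reproduced here. The function name `in_batch_negatives` and the toy batch are hypothetical.

```python
def in_batch_negatives(batch):
    """Map each data point to its in-batch negative labels.

    `batch` is a list of (point_id, positive_label_ids) pairs. A label is a
    negative for a point if it is a positive for some other point in the
    batch and not a positive for the point itself.
    """
    # Pool of every label that appears as a positive anywhere in the batch.
    pool = set().union(*(positives for _, positives in batch))
    return {pid: sorted(pool - positives) for pid, positives in batch}


toy_batch = [
    ("q1", {1, 2}),
    ("q2", {2, 3}),
    ("q3", {4}),
]
negatives = in_batch_negatives(toy_batch)
```

Because the negatives come from labels already present in the batch, no separate negative-mining structure has to be held in memory, which is the overhead the abstract points to.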
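We propose a margin-based loss for vision-language model pretraining that encourages gradient-based explanations that are consistent with region-level annotations. We refer to this objective as Attention Mask Consistency (AMC) and demonstrate that it produces superior visual grounding performance compared to models that instead rely on region-level annotations to explicitly train an object detector such as Faster R-CNN. AMC works by encouraging gradient-based explanation masks whose attention scores are concentrated mostly within the annotated regions of interest, for images that contain such annotations. In particular, a model trained with AMC on top of standard vision-language modeling objectives obtains a state-of-the-art accuracy of 86.59% on the Flickr30k visual grounding benchmark, an absolute improvement of 5.48% over the best previous model. Our approach also performs exceedingly well on established benchmarks for referring expression comprehension and offers, by design, the added benefit of gradient-based explanations that better align with human annotations.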
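One way to read "margin-based loss on explanation masks" is as a hinge that penalizes attention outside the annotated region whenever it comes within a margin of the attention inside it. The sketch below is an assumption-laden illustration of that reading, not the paper's formulation: the attention map and region mask are flattened lists, the comparison uses the maximum score on each side, and the function name `margin_attention_loss` and the margin of 0.5 are hypothetical.

```python
def margin_attention_loss(attention, mask, margin=0.5):
    """Hinge loss asking peak attention inside the annotated region to
    exceed peak attention outside it by at least `margin`.

    `attention` holds per-pixel explanation scores and `mask` holds 1 for
    pixels inside the annotated region and 0 elsewhere (both flattened).
    """
    inside = [a for a, m in zip(attention, mask) if m]
    outside = [a for a, m in zip(attention, mask) if not m]
    if not inside or not outside:
        # Without both regions there is nothing to compare.
        return 0.0
    return max(0.0, max(outside) - max(inside) + margin)


# Attention concentrated inside the annotated region: no penalty.
well_grounded = margin_attention_loss([0.9, 0.1, 0.2, 0.05], [1, 0, 0, 0])
# Attention concentrated outside the annotated region: positive penalty.
poorly_grounded = margin_attention_loss([0.3, 0.8], [1, 0])
```

A loss of this shape is zero once the explanation is sufficiently focused on the annotated region, so it stops influencing training for examples that are already well grounded.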
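We consider the task of self-supervised representation learning (SSL) for tabular data: tabular-SSL. Typical contrastive-learning-based SSL methods require instance-wise data augmentations, which are difficult to design for unstructured tabular data. Existing tabular-SSL methods design such augmentations in a relatively ad hoc fashion and can fail to capture the underlying data manifold. Instead of augmentation-based approaches to tabular-SSL, we propose a new reconstruction-based method, called Masked Encoding for Tabular Data (MET), that does not require augmentations. MET is based on the popular MAE approach for vision-SSL [He et al., 2021] and uses two key ideas: (i) since each coordinate in a tabular dataset has a distinct meaning, we need to use separate representations for all coordinates, and (ii) using an adversarial reconstruction loss in addition to the standard one. Empirical results on five diverse tabular datasets show that MET achieves a new state of the art (SOTA) on all of these datasets and improves by up to 9% over current SOTA methods. We shed more light on the workings of MET via experiments on carefully designed simple datasets.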